 mobile phone data


Reconstructing Human Mobility Pattern: A Semi-Supervised Approach for Cross-Dataset Transfer Learning

Liao, Xishun, Liu, Yifan, Kuai, Chenchen, Ma, Haoxuan, He, Yueshuai, Cao, Shangqing, Stanford, Chris, Ma, Jiaqi

arXiv.org Artificial Intelligence

Chris Stanford, Ph.D., Novateur Research Solutions, 20110 Ashbrook Place, STE 170, Ashburn, VA 20147, cstanford@novateur.ai. Submission Date: October 8, 2024.

ABSTRACT

Understanding human mobility patterns is crucial for urban planning, transportation management, and public health. This study tackles two primary challenges in the field: the reliance on trajectory data, which often fails to capture the semantic interdependencies of activities, and the inherent incompleteness of real-world trajectory data. We have developed a model that reconstructs and learns human mobility patterns by focusing on semantic activity chains. We introduce a semi-supervised iterative transfer learning algorithm to adapt models to diverse geographical contexts and address data scarcity. Our model is validated using comprehensive datasets from the United States, where it effectively reconstructs activity chains and generates high-quality synthetic mobility data, achieving a low Jensen-Shannon Divergence (JSD) value of 0.001, indicating a close similarity between synthetic and real data. Additionally, sparse GPS data from Egypt is used to evaluate the transfer learning algorithm, demonstrating successful adaptation of US mobility patterns to Egyptian contexts, achieving a 64% increase in similarity, i.e., a JSD reduction from 0.09 to 0.03. This mobility reconstruction model and the associated transfer learning algorithm show significant potential for global human mobility modeling studies, enabling policymakers and researchers to design more effective and culturally tailored transportation solutions.

Keywords: Human Mobility Patterns Modeling, Transfer Learning, Semi-Supervised Learning, Synthetic Mobility Data

INTRODUCTION

Understanding human mobility patterns has become increasingly crucial in various fields, including urban planning, transportation management (1, 2), and public health (3). As urbanization accelerates and population mobility increases, the ability to accurately comprehend and predict human activity patterns has gained paramount importance. This knowledge not only aids in optimizing urban resource allocation but also provides essential insights for the development of smart cities.
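
The abstract above reports similarity between real and synthetic mobility data as a Jensen-Shannon Divergence. As a minimal sketch of how such a score can be computed, the Python below compares two discrete distributions. The activity categories, the counts, and the choice of base-2 logarithm (which bounds the JSD to the range 0 to 1) are assumptions made for illustration; the abstract does not say which distributions or which logarithm base the authors used.

```python
import math


def normalize(counts):
    """Turn raw counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """JSD(p, q) = 0.5 * KL(p || m) + 0.5 * KL(q || m), with m the midpoint distribution."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical counts of activity types (e.g. home, work, other) in real
# and synthetic activity chains; these numbers are invented for the example.
real = normalize([50, 30, 20])
synthetic = normalize([48, 31, 21])

print(jensen_shannon_divergence(real, synthetic))
```

Identical distributions give a JSD of 0 and fully disjoint ones give 1 under the base-2 convention, so smaller values indicate synthetic data that is closer to the real data.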


Modeling Large-Scale Walking and Cycling Networks: A Machine Learning Approach Using Mobile Phone and Crowdsourced Data

Saberi, Meead, Lilasathapornkit, Tanapon

arXiv.org Artificial Intelligence

Walking and cycling are known to bring substantial health, environmental, and economic advantages. However, the development of evidence-based active transportation planning and policies has been impeded by significant data limitations, such as biases in crowdsourced data and representativeness issues of mobile phone data. In this study, we develop and apply a machine learning based modeling approach for estimating daily walking and cycling volumes across a large-scale regional network in New South Wales, Australia that includes 188,999 walking links and 114,885 cycling links. The modeling methodology leverages crowdsourced and mobile phone data as well as a range of other datasets on population, land use, topography, climate, etc. The study discusses the unique challenges and limitations related to all three aspects of model training, testing, and inference given the large geographical extent of the modeled networks and relative scarcity of observed walking and cycling count data. The study also proposes a new technique to identify model estimate outliers and to mitigate their impact. Overall, the study provides a valuable resource for transportation modelers, policymakers and urban planners seeking to enhance active transportation infrastructure planning and policies with advanced emerging data-driven modeling methodologies.


Explainability in Practice: Estimating Electrification Rates from Mobile Phone Data in Senegal

State, Laura, Salat, Hadrien, Rubrichi, Stefania, Smoreda, Zbigniew

arXiv.org Artificial Intelligence

Explainable artificial intelligence (XAI) provides explanations for machine learning (ML) models that are not interpretable. While many technical approaches exist, there is a lack of validation of these techniques on real-world datasets. In this work, we present a use-case of XAI: an ML model which is trained to estimate electrification rates based on mobile phone data in Senegal. The data originate from the Data for Development challenge by Orange in 2014/15. We apply two model-agnostic, local explanation techniques and find that while the model can be verified, it is biased with respect to the population density. We conclude our paper by pointing to the two main challenges we encountered during our work: data processing and model design that might be restricted by currently available XAI methods, and the importance of domain knowledge to interpret explanations.


Researchers build ML models to forecast food insecurity

#artificialintelligence

An international team of researchers has built a set of machine learning models they say can help predict global food shortages in the near future, helping governments and international agencies understand where they can best help. Scientists from the World Food Programme, University of London Mathematics Department and Central European University Department of Network and Data Science made use of a "unique global dataset" to build machine learning models that can explain up to 81 percent of the variation in insufficient food consumption. The study claims the machine learning models draw from indirect data sources in areas such as food prices, macro-economic indicators (including GDP), weather, conflict, prevalence of undernourishment, population density, and previous food insecurity trends. The aim is to create near-term forecasts, or "nowcasts." "We show that the proposed models can nowcast the food security situation in near real-time and propose a method to identify which variables are driving the changes observed in predicted trends -- which is key to make predictions serviceable to decision-makers," the research paper published in Nature Food this week said. The outputs of the ML models have been used to create a world map including near-term food insecurity forecasts called HungerMap.


COROID: A Crowdsourcing-based Companion Drones to Tackle Current and Future Pandemics

Rauniyar, Ashish, Hagos, Desta Haileselassie, Jha, Debesh, Håkegård, Jan Erik

arXiv.org Artificial Intelligence

Due to COVID-19, which has already been declared a pandemic by the World Health Organization (WHO), we are witnessing the greatest pandemic of the decade. Millions of people are being infected, resulting in thousands of deaths every day across the globe. Even the countries with the best healthcare provision could not handle the pandemic because of the strain of treating thousands of patients at a time. The count of infections and deaths is increasing at an alarming rate because of the spread of the virus. We believe that innovative technologies could help reduce pandemics to a certain extent until we find a definite solution from the medical field to handle and treat such pandemic situations. Technology innovation has the potential to introduce new technologies that could support people and society during these difficult times. Therefore, this paper proposes the idea of using drones as a companion to tackle current and future pandemics. Our COROID drone is based on the principle of crowdsourcing sensor data from the public's smart devices, which can be correlated with the readings of the infrared cameras mounted on the COROID drones. To the best of our knowledge, this concept has yet to be investigated either as a concept or as a product. Therefore, we believe that the COROID drone is innovative and has a huge potential to tackle COVID-19 and future pandemics.


Mining Human Mobility Data to Discover Locations and Habits

Andrade, Thiago, Cancela, Brais, Gama, João

arXiv.org Machine Learning

Many aspects of life are associated with the places that make up human mobility patterns, and the mobile devices that individuals carry are becoming increasingly pervasive. Positioning technologies that serve these devices such as the cellular antenna (GSM networks), global navigation satellite systems (GPS), and more recently the WiFi positioning system (WPS) provide large amounts of spatio-temporal data in a continuous way. Therefore, detecting significant places and the frequency of movements between them is fundamental to understanding human behavior. In this paper, we propose a method for discovering user habits without any a priori or external knowledge by introducing a density-based clustering for spatio-temporal data to identify meaningful places and by applying a Gaussian Mixture Model (GMM) over the set of meaningful places to identify the representations of individual habits. To evaluate the proposed method we use two real-world datasets. One dataset contains high-density GPS data and the other one contains GSM mobile phone data in a coarse representation. The results show that the proposed method is suitable for this task as many unique habits were identified. This can be used to understand users' behavior and to draw their characterizing profiles, giving a panorama of the mobility patterns in the data.


Predicting customer's gender and age depending on mobile phone data

AlZuabi, Ibrahim Mousa, Jafar, Assef, Aljoumaa, Kadan

arXiv.org Machine Learning

In the age of data-driven solutions, customer demographic attributes such as gender and age play a core role, enabling companies to enhance their service offers and target the right customer at the right time and place. In a marketing campaign, companies want to target the real user of the GSM (Global System for Mobile Communications) line, not the line owner, since the two are not always the same. This work proposes a method that predicts users' gender and age based on their behavior, services and contract information. We used call detail records (CDRs), customer relationship management (CRM) and billing information as a data source to analyze telecom customer behavior, and applied different types of machine learning algorithms to provide marketing campaigns with more accurate information about customer demographic attributes. This model is built using a reliable data set of 18,000 users provided by SyriaTel Telecom Company for training and testing. The model was implemented using big data technology and achieved 85.6% accuracy for user gender prediction and 65.5% for user age prediction. The main contributions of this work are the improvement in the accuracy of user gender and age prediction based on mobile phone data and an end-to-end solution that approaches customer data from multiple aspects in the telecom domain.


A Machine Learning based Robust Prediction Model for Real-life Mobile Phone Data

Sarker, Iqbal H.

arXiv.org Machine Learning

Real-life mobile phone data may contain noisy instances, which is a fundamental issue for building a prediction model with many potential negative consequences. The complexity of the inferred model may increase, overfitting may arise, and the overall prediction accuracy of the model may thereby decrease. In this paper, we address these issues and present a robust prediction model for real-life mobile phone data of individual users, in order to improve the prediction accuracy of the model. In our robust model, we first effectively identify and eliminate the noisy instances from the training dataset by determining a dynamic noise threshold using a naive Bayes classifier and Laplace estimator, which may differ from user to user according to their unique behavioral patterns. After that, we employ the most popular rule-based machine learning classification technique, i.e., decision tree, on the noise-free quality dataset to build the prediction model. Experimental results on the real-life mobile phone datasets (e.g., phone call log) of individual mobile phone users show the effectiveness of our robust model in terms of precision, recall and f-measure.
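
The noise-elimination step described in this abstract can be illustrated with a small sketch: a categorical naive Bayes model with Laplace (add-one) smoothing scores each training instance, and instances whose own label receives a low posterior probability are dropped before a classifier is trained. This is an illustration under stated assumptions, not the paper's algorithm: the function names, the toy call-log features, and above all the fixed threshold of 0.5 are hypothetical, whereas the abstract describes a dynamic threshold that differs per user and does not give its formula.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Collect the counts needed for Laplace-smoothed categorical naive Bayes."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    values = defaultdict(set)              # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(i, label)][value] += 1
            values[i].add(value)
    return class_counts, feature_counts, values


def posterior(row, model):
    """Return P(label | row) for every label, using add-one smoothing."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = (n + 1) / (total + len(class_counts))  # smoothed prior
        for i, value in enumerate(row):
            # Smoothed likelihood of this feature value given the label.
            score *= (feature_counts[(i, label)][value] + 1) / (n + len(values[i]))
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}


def filter_noisy(rows, labels, threshold):
    """Keep instances whose own label has posterior >= threshold (assumed fixed here)."""
    model = train_naive_bayes(rows, labels)
    return [(row, label) for row, label in zip(rows, labels)
            if posterior(row, model)[label] >= threshold]


# Hypothetical call-log instances: (time of day, location) -> call response.
rows = [("morning", "office")] * 4 + [("evening", "home")] * 4 + [("morning", "office")]
labels = ["reject"] * 4 + ["accept"] * 4 + ["accept"]  # the last instance is inconsistent

kept = filter_noisy(rows, labels, threshold=0.5)
print(len(kept), "of", len(rows), "instances kept")
```

The filtered instances could then be passed to any decision tree learner, which is the second stage the abstract describes.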


CD-CNN: A Partially Supervised Cross-Domain Deep Learning Model for Urban Resident Recognition

Wang, Jingyuan (Beihang University) | He, Xu (Beihang University) | Wang, Ze (Beihang University) | Wu, Junjie (Beihang University) | Yuan, Nicholas Jing (Microsoft Corporation) | Xie, Xing (Microsoft Research) | Xiong, Zhang (Research Institute of Beihang University in Shenzhen)

AAAI Conferences

Driven by the wave of urbanization in recent decades, the research topic of migrant behavior analysis has drawn great attention from both academia and the government. Nevertheless, subject to the cost of data collection and the lack of modeling methods, most existing studies use only questionnaire surveys with sparse samples and non-individual-level statistical data to achieve coarse-grained studies of migrant behaviors. In this paper, a partially supervised cross-domain deep learning model named CD-CNN is proposed for migrant/native recognition using mobile phone signaling data as behavioral features and questionnaire survey data as incomplete labels. Specifically, CD-CNN decomposes the mobile data into a location domain and a communication domain, and adopts a joint learning framework that combines two convolutional neural networks with a feature balancing scheme. Moreover, CD-CNN employs a three-step algorithm for training, in which the co-training step is of great value to partially supervised cross-domain learning. Comparative experiments on the city of Wuxi demonstrate the high predictive power of CD-CNN. Two interesting applications further highlight the ability of CD-CNN for in-depth migrant behavioral analysis.


Humanitarian efforts could be aided by AI

#artificialintelligence

Researchers have developed an AI algorithm to accurately predict the gender of pre-paid mobile phone users, which could be useful in crises. Phone tracking technology is already used to locate those in need of aid in humanitarian crises; but the latest development could help further, for example by identifying vulnerable groups such as women with potentially young children. Mobile phones are one of the fastest growing technologies in the developing world with global penetration rates reaching 90 per cent. However, the fact that most phones in developing countries are pre-paid means that the data lacks key information about the person carrying the phone, including gender and other demographic data, which could be useful in a crisis. Now, a team of researchers from the Data Science Institute and Department of Computing at Imperial College London have developed a machine learning algorithm, which is a form of AI.